Learn the SQL you need to make your application faster in our free SQL for Application Developers course.
Start learningForeign keys are an important element of any relational database. But when you’re setting up your database schema, it’s relatively easy to create problems for yourself if the foreign keys aren’t set up correctly.
Here are three of the most common mistakes made with foreign keys, and a little advice on how to avoid them!
When you’re using a foreign key to reference a column in another table, the datatypes of both tables have to be the same. For example, if the referencing column where you’re declaring the foreign key uses the INT
data type, then the referenced column must be INT
as well.
If you’re lucky, it’ll be easy to diagnose when you’ve made this mistake. You may get an error message such as Foreign key mismatch
or Could not create constraint
when you try to create the table.
However, depending on the database system you’re working with and the specific database schema, it can be easy to miss or misdiagnose a foreign key data type mismatch.
With a SQLite database, for example, it is possible to successfully create a table with a foreign key data type mismatch. This mismatch can then create other errors further down the line.
The easiest way to avoid this issue is to ensure that all columns linked to each other with foreign keys share the same data type when you’re creating the tables in the first place.
If you’re altering an existing table with a new foreign key constraint, your database system will likely return an error if you attempt to create a foreign key constraint that links columns with different data types.
A dangling foreign key is a foreign key that links to a nonexistent table or column. This is obviously bad practice, but it happens – mostly because it’s possible to create dangling foreign keys accidentally.
With some database management systems, you can create a dangling foreign key simply by creating the referencing table first, and including a foreign key constraint that links it to a column in the next table you’re planning to create. If you subsequently forget to create that referenced table or column, you’ve got a dangling foreign key.
Many database systems, including CockroachDB, prevent this by making it impossible to create a table that references a nonexistent table or column – if you try, the CREATE TABLE
statement will fail with an error such as ERROR: relation "referenced_table" does not exist
.
More commonly, dangling foreign keys crop up when you remove a table or column that was being referenced elsewhere in the database. Some database management systems will allow this, leaving a dangling foreign key, while others will throw errors. In CockroachDB, for example, attempting to drop a table that’s referenced by another table’s foreign key will result in the following error: ERROR: "table_1" is referenced by foreign key from table "table_2"
.
Other database systems may throw errors that are a bit tougher to parse. If you’re attempting to drop a column or table but repeatedly getting an error, it could be because you’re attempting to remove something that’s a foreign key constraint in another table.
The best way to avoid dangling foreign keys is simply to use a modern database system that can validate constraints when they’re added to a table, and that won’t allow users to break the database’s referential integrity by removing data that’s required by another table.
If upgrading to a database like CockroachDB isn’t possible, though, the next-best defense against dangling foreign keys is good documentation that highlights foreign key relationships, and careful planning, particularly if anything is being removed from the database.
This “mistake” is a bit different from the other two we’ve talked about. Not creating an index for a foreign key won’t produce an error message. However, failing to use foreign key indexes can mean you’re unintentionally taking a big performance hit, particularly when you’re operating at scale.
Foreign key indexes can significantly improve performance for queries that involve joins between the parent and child tables. How frequently your application performs these joins will vary, but in most cases the creation of a foreign key index is warranted because it will allow those queries to avoid triggering a full table scan. (Full table scans are slow, costly, and best avoided.)
The important thing to remember here is that while primary indexes are created automatically when you designate a table’s primary key, foreign key indexes typically must be created manually.
Don’t skip that step! Even if you’re working with a small enough database that the performance improvements don’t make a meaningful difference yet, it’s better to get indexes in place. Otherwise, as you scale and grow, what was a difference of a few milliseconds on your small test database can turn into lag that your users feel acutely on a production database at scale.
The biggest foreign key mistake of all is not using foreign keys, and instead relying on informal links between databases.
A quick example will be illustrative here: imagine our sales database has a table for leads
(individual people) and a separate table for companies
.
We could link those two tables informally by including the name of each lead’s company in a string somewhere, for example as part of a column job_title
that might have values like ’sales rep at Veridian Dynamics’
. We could join these two tables by, for example, running a query that checks each company name in companies
to see if the name exists as a substring in any of the rows of leads.job_title
.
That approach is inefficient, but more importantly, it makes it easy for errors to sneak into our data and everything that data is subsequently used for. Using a foreign key constraint helps us ensure that only data that can be linked to the referenced table can be entered into our database.
Imagine, to continue our example, that a user accidentally types a lead’s company name as Vridian Dynamics
. If we’ve created a leads.company
column with a foreign key constraint linking it to companies.name
, the user’s query will return an immediate error, and they’re likely to notice and fix their typo. If we’re just using the informal link described previously, the typo will be entered into the database, and a subsequent query of leads working at Veridian Dynamics would miss the new lead because of the typo.
Ultimately, most common foreign key mistakes are the result of sloppy planning. Taking a little extra time to plan out your database schema up front can help you avoid headaches later down the line.
Good questions to ask yourself in the planning stage include:
It always pays to spend extra time thinking about architecture and how your work will scale before you start building!
Most SQL content on the web seems to be written with data analysts in mind. And that’s fine, but developers need …
Read moreA foreign key is a column or columns in a database that (e.g. table_1.column_a
) that are …
NOT NULL
constraints are a ubiquitous tool for upholding data integrity. However, their performance benefits are less …